Design of efficient classifier integration and performance evaluation in machine learning

Author

  • Manoj Kumar Singh
Abstract

The characteristics of any classifier depend heavily on the nature of the data set used for training and verification. Application areas such as health care suffer from a lack of large, suitable data sets. A classifier designed for health care should show good generalization and robustness so that its results can be treated with high reliability and confidence. This paper presents the consistency problem associated with classifiers, which is a major issue from a practical point of view. Defining a committee of experts is a natural way to increase reliability in classifier design, but the method of integration governs the final performance. To overcome the problems of generalization and consistency, two methods for developing a mixture of classifiers, namely TMQD and MVFD, are presented. Estimating the quality of a classifier is a challenging task for researchers because no single parameter alone represents absolute performance. For measuring classifier quality, the receiver operating characteristic (ROC) is a better choice than conventional parameters such as sensitivity and specificity. In practical health care settings, however, ROC is rarely used. This paper also presents a detailed explanation of ROC and of estimating the area under the curve. Selection of the threshold value is one of the most important factors in determining classifier performance. The dependence of the threshold value on population and geographical area makes it difficult to decide on an optimal value. A graphical approach is presented for selecting the best threshold value according to the environment and need.

Index Terms: Data Mining, Classifier, Classifier Integration, ROC, Area under ROC, Sensitivity, Specificity, Heart Diseases, Neural Networks.
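The abstract refers to ROC analysis, area-under-curve estimation, and threshold selection without giving details. The following is a minimal sketch of those generic ideas, not the paper's method: it builds ROC points by sweeping a threshold over classifier scores, estimates the area with the trapezoidal rule, and picks a threshold with Youden's J statistic (sensitivity + specificity - 1). Youden's J is a common criterion assumed here in place of the paper's graphical approach, which is not described on this page. The function names and the sample scores and labels are hypothetical.

```python
from typing import List, Tuple

RocPoint = Tuple[float, float, float]  # (threshold, false positive rate, true positive rate)


def roc_points(scores: List[float], labels: List[int]) -> List[RocPoint]:
    """Sweep thresholds from high to low; predict positive when score >= threshold."""
    pos = sum(labels)            # number of positive cases (label 1)
    neg = len(labels) - pos      # number of negative cases (label 0)
    points: List[RocPoint] = [(float("inf"), 0.0, 0.0)]  # nothing predicted positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points


def auc_trapezoid(points: List[RocPoint]) -> float:
    """Estimate the area under the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def youden_threshold(points: List[RocPoint]) -> float:
    """Return the threshold maximizing J = TPR - FPR (assumed criterion, see lead-in)."""
    # Skip the artificial infinite threshold; ties resolve to the highest threshold.
    return max(points[1:], key=lambda p: p[2] - p[1])[0]


# Hypothetical scores and ground-truth labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = auc_trapezoid(points)
threshold = youden_threshold(points)
print(f"AUC = {auc:.4f}, chosen threshold = {threshold}")
```

In this sketch sensitivity is the true positive rate and specificity is one minus the false positive rate, so each ROC point summarizes both conventional parameters at one threshold. A different criterion, for example one that weights false negatives more heavily in a screening setting, would select a different threshold from the same curve.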

Similar resources

Fault diagnosis in a distillation column using a support vector machine based classifier

Fault diagnosis has always been an essential aspect of control system design. This is necessary due to the growing demand for increased performance and safety of industrial systems. The support vector machine classifier is a new technique based on statistical learning theory and is designed to reduce structural bias. Support vector machine classification in many applications in v...

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

Verification of unemployment benefits’ claims using Classifier Combination method

Unemployment insurance is one of the most popular insurance types in the modern world. The Social Security Organization is responsible for checking the unemployment benefits of individuals supported by unemployment insurance. Manual evaluation of unemployment claims requires a great deal of time and money. Data mining and machine learning, as two efficient tools for data analysis, can assist ...

Emotion Detection in Persian Text; A Machine Learning Model

This study aimed to develop a computational model for recognition of emotion in Persian text as a supervised machine learning problem. We considered the Plutchik emotion model as the supervised learning criterion and Support Vector Machine (SVM) as the baseline classifier. We also used the NRC lexicon and contextual features as training data and components of the model. One hundred selected texts including pol...

Improving the quality of text classification using a two-level classifier committee

Automated text classification has gained special importance due to the increasing availability of documents in digital form and the ensuing need to organize them. Although this problem belongs to the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown better performance than the others. I...

Publication date: 2012